"Express Mail" Mailing No: EV 339 773 914 US 
Date of Deposit: October 16, 2003 



PATENT 

Case No. 11336/592 (P03131USC) 



SYSTEM FOR SUPPRESSING WIND NOISE 



Inventors: 



Phil Hetherington 

Xueman Li 
Pierre Zakarauskas 



Attorney: James A. Collins 

Registration No. 43,557 

BRINKS HOFER GILSON & LIONE 

P.O. Box 10395 

Chicago, Illinois 60610 

(312) 321-4200 



"Express Mail" Mailing No: EV 339 773 914 US 

Priority Claim 

[001] This application is a continuation-in-part of U.S. Application No. 10/410,736, 
"Method and Apparatus for Suppressing Wind Noise," filed April 10, 2003. The disclosure 
of the above application is incorporated herein by reference. 

Background Of The Invention 

1. Technical Field. 

[002] This invention relates to acoustics, and more particularly, to a system that 
enhances the perceptual quality of a processed voice. 

2. Related Art. 

[003] Many hands-free communication devices acquire, assimilate, and transfer a 
voice signal. Voice signals pass from one system to another through a communication 
medium. In some systems, including some used in vehicles, the clarity of the voice signal 
does not depend on the quality of the communication system or the quality of the 
communication medium. When noise occurs near a source or a receiver, distortion garbles 
the voice signal, destroys information, and in some instances, masks the voice signal so that it 
is not recognized by a listener. 

[004] Noise, which may be annoying, distracting, or result in a loss of information, 
may come from many sources. Within a vehicle, noise may be created by the engine, the 
road, the tires, or by the movement of air. A natural or artificial movement of air may be 
heard across a broad frequency range. Continuous fluctuations in amplitude and frequency 
may make wind noise difficult to overcome and may degrade the intelligibility of a voice signal. 

[005] Many systems attempt to counteract the effects of wind noise. Some systems 
rely on a variety of sound-suppressing and dampening materials throughout an interior to 
ensure a quiet and comfortable environment. Other systems attempt to average out varying 
wind-induced pressures that press against a receiver. These noise reducers may take many 
shapes to filter out selected pressures making them difficult to design to the many interiors of 
a vehicle. Another problem with some speech enhancement systems is that of detecting wind 
noise in a background of a continuous noise. Yet another problem with some speech 
enhancement systems is that they do not easily adapt to other communication systems that are 
susceptible to wind noise. 

[006] Therefore there is a need for a system that counteracts wind noise across a 
varying frequency range. 



Summary 

[007] A voice enhancement logic improves the perceptual quality of a processed 
voice. The system learns, encodes, and then dampens the noise associated with the 
movement of air from an input signal. The system includes a noise detector and a noise 
attenuator. The noise detector detects a wind buffet by modeling. The noise attenuator then 
dampens the wind buffet. 

Alternative voice enhancement logic includes time frequency transform logic, a 
background noise estimator, a wind noise detector, and a wind noise attenuator. The time 
frequency transform logic converts a time varying input signal into a frequency domain 
output signal. The background noise estimator measures the continuous noise that may 
accompany the input signal. The wind noise detector automatically identifies and models a 
wind buffet, which may then be dampened by the wind noise attenuator. 

[008] Other systems, methods, features and advantages of the invention will be, or 
will become, apparent to one with skill in the art upon examination of the following figures 
and detailed description. It is intended that all such additional systems, methods, features and 
advantages be included within this description, be within the scope of the invention, and be 
protected by the following claims. 

Brief Description Of The Drawings 

[009] The invention can be better understood with reference to the following 
drawings and description. The components in the figures are not necessarily to scale, 
emphasis instead being placed upon illustrating the principles of the invention. Moreover, in 
the figures, like referenced numerals designate corresponding parts throughout the different 
views. 

[010] Figure 1 is a partial block diagram of voice enhancement logic. 
[011] Figure 2 is noise that may be associated with wind and other sources in the 
frequency domain. 

[012] Figure 3 is a signal-to-noise ratio of the noise that may be associated with 
wind and other sources in the frequency domain. 

[013] Figure 4 is a block diagram of the voice enhancement logic of Figure 1. 

[014] Figure 5 is a pre-processing system coupled to the voice enhancement logic of 
Figure 1. 

[015] Figure 6 is an alternative pre-processing system coupled to the voice 
enhancement logic of Figure 1. 

[016] Figure 7 is a block diagram of an alternative voice enhancement system. 

[017] Figure 8 is noise that may be associated with wind and other sources in the 
frequency domain. 

[018] Figure 9 is a graph of a wind buffet masking a portion of a voice signal. 
[019] Figure 10 is a graph of a processed and reconstructed voice signal. 
[020] Figure 11 is a flow diagram of a voice enhancement. 
[021] Figure 12 is a partial sequence diagram of a voice enhancement. 
[022] Figure 13 is a partial sequence diagram of a voice enhancement. 
[023] Figure 14 is a block diagram of voice enhancement logic within a vehicle. 
[024] Figure 15 is a block diagram of voice enhancement logic interfaced to an 
audio system and/or a communication system. 

Detailed Description of the Preferred Embodiments 

[025] A voice enhancement logic improves the perceptual quality of a processed 
voice. The logic may automatically learn and encode the shape and form of the noise 
associated with the movement of air in a real or a delayed time. By tracking selected 
attributes, the logic may eliminate or dampen wind noise using a limited memory that 
temporarily stores the selected attributes of the noise. Alternatively, the logic may also 
dampen a continuous noise and/or the "musical noise," squeaks, squawks, chirps, clicks, 
drips, pops, low frequency tones, or other sound artifacts that may be generated by some 
voice enhancement systems. 

[026] Figure 1 is a partial block diagram of the voice enhancement logic 100. The 
voice enhancement logic may encompass hardware or software that is capable of running on 
one or more processors in conjunction with one or more operating systems. The highly 
portable logic includes a wind noise detector 102 and a noise attenuator 104. 

[027] In Figure 1, the wind noise detector 102 may identify and model a noise 
associated with wind flow from the properties of air. While wind noise occurs naturally or 
may be artificially generated over a broad frequency range, the wind noise detector 102 is 
configured to detect and model the wind noise that is perceived by the ear. The wind noise 
detector receives incoming sound that, in the short term spectra, may be classified into three 
broad categories: (1) unvoiced, which exhibits noise-like characteristics that include the 
noise associated with wind, i.e., it may have some spectral shape but no harmonic or formant 
structure; (2) fully voiced, which exhibits a regular harmonic structure, or peaks at pitch 
harmonics weighted by the spectral envelope that may describe the formant structure; and (3) 
mixed voice, which exhibits a mixture of the above two categories, some parts containing 
noise-like segments, the rest exhibiting a regular harmonic structure and/or a formant 
structure. 

[028] The wind noise detector 102 may separate the noise-like segments from the 
remaining signal in a real or in a delayed time no matter how complex or how loud an 
incoming segment may be. The separated noise-like segments are analyzed to detect the 
occurrence of wind noise, and in some instances, the presence of a continuous underlying 
noise. When wind noise is detected, the spectrum is modeled, and the model is retained in a 
memory. While the wind noise detector 102 may store an entire model of a wind noise 
signal, it also may store selected attributes in a memory. 

[029] To overcome the effects of wind noise, and in some instances, the underlying 
continuous noise that may include ambient noise, the noise attenuator 104 substantially 
removes or dampens the wind noise and/or the continuous noise from the unvoiced and 
mixed voice signals. The voice enhancement logic 100 encompasses any system that 
substantially removes or dampens wind noise. Examples of systems that may dampen or 
remove wind noise include systems that use a signal and a noise estimate such as (1) systems 
which use a neural network mapping of a noisy signal and an estimate of the noise to a 
noise-reduced signal, (2) systems which subtract the noise estimate from a noisy signal, (3) systems 
that use the noisy signal and the noise estimate to select a noise-reduced signal from a 
code-book, and (4) systems that in any other way use the noisy signal and the noise estimate to create a 
noise-reduced signal based on a reconstruction of the masked signal. These systems may 
attenuate wind noise, and in some instances, attenuate the continuous noise that may be part 
of the short-term spectra. The noise attenuator 104 may also interface with or include an optional 
residual attenuator 106 that removes or dampens artifacts that may result in the processed 
signal. The residual attenuator 106 may remove the "musical noise," squeaks, squawks, 
chirps, clicks, drips, pops, low frequency tones, or other sound artifacts. 

[030] Figure 2 illustrates exemplary noise associated with three wind flows. The 
wind buffets 202, 204, and 206, which are the events of wind striking a detector, vary by their 
level of severity or amplitude. The amplitudes reflect the relative differences in power or 
intensity between the fluctuations of air pressure received across an input area of a receiver or 
a detector. The line underlying the wind buffets illustrates the continuous noise 208 that is 
also sensed by the receiver or detector. In a vehicle, wind buffets may represent the natural 
flow of air through a window, through an open top of a convertible, through an inlet, or the 
artificial movement of air caused by a fan or a heating, ventilating, and/or air conditioning 
system (HVAC). The continuous noise may represent an ambient noise or a noise associated 
with an engine, a powertrain, a road, tires, or other sounds. 

[031] In the time and frequency spectral domain, the continuous noise 208 and a 
wind buffet 202 may be curvilinear. The continuous noise and wind buffet may appear to be 
formed or characterized by the curved lines shown in Figure 2. However, when the signal 
strength (in decibels) of the wind buffet (e.g., $\sigma_{WB}$) is related to the signal strength of a 
continuous noise (e.g., $\sigma_{CN}$) in the signal-to-noise ratio (SNR) domain, the wind buffet 202 
may be characterized by a linear function with a vertical dimension corresponding to decibels 
and a horizontal dimension corresponding to frequency. This relation may be expressed as: 

$SNR = \sigma_{WB} - \sigma_{CN}$ (Equation 1) 

Any method may approximate the linearity of a wind buffet. In the signal-to-noise domain, 
an offset or y-intercept 302 and an x-intercept or pivot point may characterize the linear 
model 302. Alternatively, an x or y-coordinate and a slope may model the wind buffet. In 
Figure 3, the linear model 302 descends in a negative slope. 
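
The relation in Equation 1 and the offset/pivot-point description of the linear model can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the sample values, and the choice to clamp the model to zero at and above the pivot point are assumptions that do not come from the specification.

```python
def snr_db(wind_buffet_db, continuous_noise_db):
    """Equation 1: with both levels in decibels, the SNR is their difference."""
    return wind_buffet_db - continuous_noise_db


def line_from_offset_and_pivot(offset_db, pivot_hz):
    """Return (slope, intercept) of the line through (0, offset_db) and (pivot_hz, 0)."""
    if pivot_hz <= 0:
        raise ValueError("pivot frequency must be positive")
    slope = -offset_db / pivot_hz  # negative whenever the offset is positive
    return slope, offset_db


def model_snr_db(freq_hz, offset_db, pivot_hz):
    """Evaluate the linear model at freq_hz.

    Assumption: the model contributes nothing at or above the pivot point,
    so negative values are clamped to zero.
    """
    slope, intercept = line_from_offset_and_pivot(offset_db, pivot_hz)
    return max(0.0, intercept + slope * freq_hz)


if __name__ == "__main__":
    # Hypothetical levels: a 62 dB wind buffet over a 50 dB continuous noise.
    offset = snr_db(62.0, 50.0)
    for f in (0.0, 100.0, 200.0, 400.0, 800.0):
        print(f, model_snr_db(f, offset, pivot_hz=400.0))
```

Storing only the offset and the pivot frequency (two numbers) is one way the "selected attributes" of a wind buffet could be kept in a limited memory.
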

[032] Figure 4 is a block diagram of an example wind noise detector 102 that may 
receive or detect an unvoiced, fully voiced, or a mixed voice input signal. A received or 
detected signal is digitized at a predetermined frequency. To assure a good quality voice, the 
voice signal is converted to a pulse-code-modulated (PCM) signal by an analog-to-digital 
converter 402 (ADC) having any common sample rate. A smooth window 404 is applied to a 
block of data to obtain the windowed signal. The complex spectrum for the windowed signal 
may be obtained by means of a fast Fourier transform (FFT) 406 that separates the digitized 
signals into frequency bins, with each bin identifying an amplitude and phase across a small 
frequency range. Each frequency bin may then be converted into the power-spectral domain 
408 and logarithmic domain 410 to develop a wind buffet and continuous noise estimate. As 
more windows of sound are processed, the wind noise detector 102 may derive average noise 
estimates. A time-smoothed or weighted average may be used to estimate the wind buffet 
and continuous noise estimates for each frequency bin. 
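
The window, transform, power, and logarithm stages described for Figure 4 might be sketched as follows. The sketch assumes a Hann window as the "smooth window," uses a naive discrete Fourier transform as a stand-in for an FFT so that it needs only the standard library, and uses an exponential average as the time-smoothed estimate; all names and the smoothing constant are hypothetical.

```python
import cmath
import math


def hann_window(n):
    """Smooth (Hann) window of length n."""
    if n < 2:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def dft(block):
    """Naive O(n^2) DFT standing in for an FFT; returns bins 0..n/2."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(block))
        for k in range(n // 2 + 1)
    ]


def log_power_spectrum(block, floor=1e-12):
    """Window a block, transform it, and return the per-bin power in decibels."""
    window = hann_window(len(block))
    spectrum = dft([x * w for x, w in zip(block, window)])
    return [10.0 * math.log10(max(abs(c) ** 2, floor)) for c in spectrum]


def smooth(previous_db, current_db, alpha=0.9):
    """Time-smoothed (exponentially weighted) per-bin average."""
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(previous_db, current_db)]


if __name__ == "__main__":
    # Hypothetical 32-sample block holding a tone centered on bin 4.
    block = [math.sin(2.0 * math.pi * 4 * t / 32) for t in range(32)]
    spectrum_db = log_power_spectrum(block)
    print(spectrum_db.index(max(spectrum_db)))
```
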

[033] To detect a wind buffet, a line may be fitted to a selected portion of the low 
frequency spectrum in the SNR domain. Through a regression, a best-fit line may measure 
the severity of the wind noise within a given block of data. A high correlation between the 
best-fit line and the low frequency spectrum may identify a wind buffet. Whether or not a 
high correlation exists may depend on a desired clarity of a processed voice and the 
variations in frequency and amplitude of the wind buffet. Alternatively, a wind buffet may 
be identified when an offset or y-intercept of the best-fit line exceeds a predetermined 
threshold (e.g., > 3 dB). 
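
A least-squares regression is one way to obtain the best-fit line described above. The sketch below fits a line to SNR values over a low frequency band and flags a wind buffet when the offset exceeds a threshold or the correlation is strong. The 3 dB threshold follows the example in the text; the correlation cutoff of 0.8, the requirement that the line descend, and all names are assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, correlation)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0:
        raise ValueError("frequencies must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    corr = sxy / math.sqrt(sxx * syy) if syy > 0.0 else 0.0
    return slope, intercept, corr


def is_wind_buffet(freqs_hz, snr_db, offset_threshold_db=3.0, min_abs_corr=0.8):
    """Flag a wind buffet from the fitted offset or the quality of the fit.

    Assumption: only descending lines are considered, mirroring the
    negative slope of the linear model in Figure 3.
    """
    slope, intercept, corr = fit_line(freqs_hz, snr_db)
    descending = slope < 0.0
    return descending and (intercept > offset_threshold_db or abs(corr) >= min_abs_corr)


if __name__ == "__main__":
    freqs = [50.0, 100.0, 150.0, 200.0, 250.0]
    snr = [10.5, 9.0, 7.5, 6.0, 4.5]  # hypothetical SNR values in dB
    print(fit_line(freqs, snr), is_wind_buffet(freqs, snr))
```
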

[034] To limit a masking of voice, the fitting of the line to a suspected wind buffet 
signal may be constrained by rules. Exemplary rules may prevent a calculated offset, slope, 
or coordinate point in a wind buffet model from exceeding an average value. Another rule 
may prevent the wind noise detector 102 from applying a calculated wind buffet correction 
when a vowel or another harmonic structure is detected. A harmonic may be identified by its 
narrow width and its sharp peak, or in conjunction with a voice or a pitch detector. If a vowel 
or another harmonic structure is detected, the wind noise detector may limit the wind buffet 
correction to values less than or equal to average values. An additional rule may allow the 
average wind buffet model or its attributes to be updated only during unvoiced segments. If a 
voiced or a mixed voice segment is detected, the average wind buffet model or its attributes 
are not updated under this rule. If no voice is detected, the wind buffet model or each 
attribute may be updated through any means, such as through a weighted average or a leaky 
integrator. Many other rules may also be applied to the model. The rules may provide a 
substantially good linear fit to a suspected wind buffet without masking a voice segment. 
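
Two of the rules above, limiting the correction when a harmonic structure is present and updating the average model only during unvoiced segments, can be illustrated with a short sketch. The function names, the leak factor, and the idea of tracking a single offset attribute are assumptions for illustration only.

```python
def constrain_offset(fitted_offset_db, average_offset_db, harmonic_detected):
    """Limit the fitted offset to the running average when a harmonic is present."""
    if harmonic_detected:
        return min(fitted_offset_db, average_offset_db)
    return fitted_offset_db


def update_average(average_db, observed_db, voiced, leak=0.1):
    """Leaky-integrator update that is skipped for voiced or mixed voice segments."""
    if voiced:
        return average_db  # rule: do not update the model while voice is present
    return (1.0 - leak) * average_db + leak * observed_db


if __name__ == "__main__":
    average = 9.0  # hypothetical running average offset in dB
    print(constrain_offset(15.0, average, harmonic_detected=True))
    print(update_average(average, 15.0, voiced=False))
```

A small leak factor makes the average respond slowly, which keeps a single loud buffet from dominating the stored model.
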

[035] To overcome the effects of wind noise, a wind noise attenuator 104 may 
substantially remove or dampen the wind buffet from the noisy spectrum by any method. 
One method may add the wind buffet model to a recorded or modeled continuous noise. In 
the power spectrum, the modeled noise may then be subtracted from the unmodified 
spectrum. If an underlying peak or valley 902 is masked by a wind buffet 202 as shown in 
Figure 9 or masked by a continuous noise, a conventional or modified interpolation method 
may be used to reconstruct the peak and/or valley as shown in Figure 10. A linear or 
step-wise interpolator may be used to reconstruct the missing part of the signal. An inverse FFT 
may then be used to convert the signal power to the time domain, which provides a 
reconstructed voice signal. 
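
The attenuation method described above might be sketched as follows, under one reading of the text: the wind buffet model (in dB above the continuous noise, per Equation 1) is added to the continuous noise level, the result is converted to power, and that power is subtracted from the noisy spectrum. Bins left without usable signal receive a floor value, and masked bins can be rebuilt by linear interpolation. The names, the floor value, and the masking flags are hypothetical.

```python
import math


def db_to_power(level_db):
    return 10.0 ** (level_db / 10.0)


def power_to_db(power, floor=1e-12):
    return 10.0 * math.log10(max(power, floor))


def attenuate(noisy_db, continuous_db, wind_model_db, floor_db=-60.0):
    """Subtract the modeled noise from the noisy spectrum in the power domain."""
    cleaned = []
    for noisy, cont, wind in zip(noisy_db, continuous_db, wind_model_db):
        modeled_noise_db = cont + wind  # Equation 1 rearranged
        residual = db_to_power(noisy) - db_to_power(modeled_noise_db)
        cleaned.append(power_to_db(residual) if residual > 0.0 else floor_db)
    return cleaned


def interpolate_masked(values, masked):
    """Linearly interpolate masked bins from the nearest unmasked neighbors."""
    out = list(values)
    known = [i for i, m in enumerate(masked) if not m]
    if not known:
        return out
    for i, m in enumerate(masked):
        if not m:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out


if __name__ == "__main__":
    print(attenuate([20.0, 0.0], [10.0, 10.0], [0.0, 5.0]))
    print(interpolate_masked([0.0, 99.0, 99.0, 6.0], [False, True, True, False]))
```
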

[036] To minimize the "musical noise," squeaks, squawks, chirps, clicks, drips, pops, 
low frequency tones, or other sound artifacts that may be generated in the low frequency 
range by some wind noise attenuators, an optional residual attenuator 106 (shown in Figure 
1) may also condition the voice signal before it is converted to the time domain. The residual 
attenuator 106 may track the power spectrum within a low frequency range (e.g., less than 
about 400 Hz). When a large increase in signal power is detected, an improvement may be 
obtained by limiting or dampening the transmitted power in the low frequency range to a 
predetermined or calculated threshold. A calculated threshold may be equal to, or based on, 
the average spectral power of that same low frequency range at an earlier period in time. 
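
A residual attenuator of this kind could be sketched as a per-bin limiter. The 400 Hz cutoff follows the example in the text; the factor that defines a "large increase," the use of the earlier average itself as the calculated threshold, and the names are assumptions.

```python
def limit_low_frequency_power(power, freqs_hz, earlier_average,
                              cutoff_hz=400.0, max_ratio=2.0):
    """Clamp low-frequency bins whose power jumps well above an earlier average."""
    limited = []
    for p, f, avg in zip(power, freqs_hz, earlier_average):
        if f < cutoff_hz and p > max_ratio * avg:
            limited.append(avg)  # calculated threshold: the earlier average power
        else:
            limited.append(p)
    return limited


if __name__ == "__main__":
    # Hypothetical linear power values for bins at 100 Hz, 200 Hz, and 1 kHz.
    print(limit_low_frequency_power([10.0, 1.0, 10.0],
                                    [100.0, 200.0, 1000.0],
                                    [1.0, 1.0, 1.0]))
```
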

[037] Further improvements to voice quality may be achieved by pre-conditioning 
the input signal before the wind noise detector processes it. One pre-processing system may 
exploit the lag time with which a signal may arrive at different detectors that are positioned apart as 
shown in Figure 5. If multiple detectors or microphones 502 are used that convert sound into 
an electric signal, the pre-processing system may include control logic 504 that automatically 
selects the microphone 502 and channel that senses the least amount of noise. When another 
microphone 502 is selected, the electric signal may be combined with the previously 
generated signal before being processed by the wind noise detector 102. 

[038] Alternatively, multiple wind noise detectors 102 may be used to analyze the 
input of each of the microphones 502 as shown in Figure 6. Spectral wind buffet estimates 
may be made on each of the channels. A mixing of one or more channels may occur by 
switching between the outputs of the microphones 502. The signals may be evaluated and 
selected on a frequency-by-frequency basis until the frequency of the pivot point 304 (shown 
in Figure 3) is reached. Alternatively, control logic 602 may combine the output signals of 
multiple wind noise detectors 102 at a specific frequency or frequency range through a 
weighting function. When the frequency of the pivot point is exceeded, the process may 
continue or a standard adaptive beam forming method may be used. 
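
The frequency-by-frequency selection between microphone channels might look like the sketch below, which picks, for every bin below the pivot frequency, the channel whose spectral wind buffet estimate is lowest. Above the pivot the sketch simply passes the first channel through as a placeholder; it does not implement beam forming. The data layout and names are assumptions.

```python
def select_per_bin(channel_spectra, wind_estimates, freqs_hz, pivot_hz):
    """Mix channels bin by bin, choosing the least wind-affected channel below the pivot.

    channel_spectra[ch][b] and wind_estimates[ch][b] hold the spectrum value
    and wind buffet estimate of channel ch in frequency bin b.
    """
    mixed = []
    channels = range(len(channel_spectra))
    for b, f in enumerate(freqs_hz):
        if f < pivot_hz:
            best = min(channels, key=lambda ch: wind_estimates[ch][b])
        else:
            best = 0  # placeholder for continued selection or beam forming
        mixed.append(channel_spectra[best][b])
    return mixed


if __name__ == "__main__":
    spectra = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # two hypothetical channels
    wind = [[9.0, 1.0, 5.0], [2.0, 8.0, 1.0]]
    print(select_per_bin(spectra, wind, [100.0, 200.0, 500.0], pivot_hz=400.0))
```
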

[039] Figure 7 is alternative voice enhancement logic 700 that also improves the 
perceptual quality of a processed voice. The enhancement is accomplished by 
time-frequency transform logic 702 that digitizes and converts a time varying signal to the 
frequency domain. A background noise estimator 704 measures the continuous or ambient 
noise that occurs near a sound source or the receiver. The background noise estimator 704 
may comprise a power detector that averages the acoustic power in each frequency bin. To 
prevent biased noise estimations at transients, a transient detector 706 disables the noise 
estimation process during abnormal or unpredictable increases in power. In Figure 7, the 
transient detector 706 disables the background noise estimator 704 when an instantaneous 
background noise $B(f,i)$ exceeds an average background noise $B(f)_{Ave}$ by more than a 
selected decibel level $c$. This relationship may be expressed as: 

$B(f,i) > B(f)_{Ave} + c$ (Equation 2) 
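
The gating in Equation 2 can be illustrated with a small estimator that averages the power in each frequency bin and skips the update for any bin whose instantaneous level exceeds the average by more than the selected level. Applying the test per bin, the default level of 6 dB, the smoothing constant, and the class name are assumptions for illustration.

```python
class BackgroundNoiseEstimator:
    """Per-bin running average in dB with a transient gate (Equation 2)."""

    def __init__(self, initial_db, c_db=6.0, alpha=0.95):
        self.average_db = list(initial_db)
        self.c_db = c_db
        self.alpha = alpha

    def update(self, instant_db):
        """Fold one frame into the average, skipping bins flagged as transients."""
        for b, level in enumerate(instant_db):
            if level > self.average_db[b] + self.c_db:
                continue  # Equation 2 holds: estimation disabled for this bin
            self.average_db[b] = (
                self.alpha * self.average_db[b] + (1.0 - self.alpha) * level
            )
        return self.average_db


if __name__ == "__main__":
    estimator = BackgroundNoiseEstimator([10.0, 10.0], c_db=6.0, alpha=0.5)
    print(estimator.update([12.0, 30.0]))  # the second bin is treated as a transient
```
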

[040] To detect a wind buffet, a wind noise detector 708 may fit a line to a selected 
portion of the spectrum in the SNR domain. Through a regression, a best-fit line may model 
the severity of the wind noise 202, as shown in Figure 8. To limit any masking of voice, the 
fitting of the line to a suspected wind buffet may be constrained by the rules described above. 
A wind buffet may be identified when the offset or y-intercept of the line exceeds a 
predetermined threshold or when there is a high correlation between a fitted line and the 
noise associated with a wind buffet. Whether or not a high correlation exists may depend on 
a desired clarity of a processed voice and the variations in frequency and amplitude of the 
wind buffet. 

[041] Alternatively, a wind buffet may be identified by the analysis of time varying 
spectral characteristics of the input signal that may be graphically displayed on a 
spectrograph. A spectrograph may produce a two dimensional pattern called a spectrogram 
in which the vertical dimensions correspond to frequency and the horizontal dimensions 
correspond to time. 

[042] A signal discriminator 710 may mark the voice and noise of the spectrum in 
real or delayed time. Any method may be used to distinguish voice from noise. In Figure 7, 
voiced signals may be identified by (1) the narrow widths of their bands or peaks; (2) the 
resonant structure that may be harmonically related; (3) the resonances or broad peaks that 
correspond to formant frequencies; (4) characteristics that change relatively slowly with time; 
(5) their durations; and when multiple detectors or microphones are used, (6) the correlation 
of the output signals of the detectors or microphones. 

[043] To overcome the effects of wind noise, a wind noise attenuator 712 may 
dampen or substantially remove the wind buffet from the noisy spectrum by any method. 
One method may add the substantially linear wind buffet model to a recorded or modeled 
continuous noise. In the power spectrum, the modeled noise may then be removed from the 
unmodified spectrum by the means described above. If an underlying peak or valley 902 is 
masked by a wind buffet 202 as shown in Figure 9 or masked by a continuous noise, a 
conventional or modified interpolation method may be used to reconstruct the peak and/or 
valley as shown in Figure 10. A linear or step-wise interpolator may be used to reconstruct 
the missing part of the signal. A time series synthesizer may then be used to convert the 
signal power to the time domain, which provides a reconstructed voice signal. 

[044] To minimize the "musical noise," squeaks, squawks, chirps, clicks, drips, 
pops, low frequency tones, or other sound artifacts that may be generated in the low 
frequency range by some wind noise attenuators, an optional residual attenuator 714 may also 
be used. The residual attenuator 714 may track the power spectrum within a low frequency 
range. When a large increase in signal power is detected, an improvement may be obtained 
by limiting the transmitted power in the low frequency range to a predetermined or calculated 
threshold. A calculated threshold may be equal to or based on the average spectral power of 
that same low frequency range at a period earlier in time. 

[045] Figure 11 is a flow diagram of a voice enhancement that removes some wind 
buffets and continuous noise to enhance the perceptual quality of a processed voice. At act 
1102 a received or detected signal is digitized at a predetermined frequency. To assure a 
good quality voice, the voice signal may be converted to a PCM signal by an ADC. At act 
1104 a complex spectrum for the windowed signal may be obtained by means of an FFT that 
separates the digitized signals into frequency bins, with each bin identifying an amplitude and 
a phase across a small frequency range. 

[046] At act 1106, a continuous or ambient noise is measured. The background 
noise estimate may comprise an average of the acoustic power in each frequency bin. To 
prevent biased noise estimations at transients, the noise estimation process may be disabled 
during abnormal or unpredictable increases in power at act 1108. The transient detection act 
1108 disables the background noise estimate when an instantaneous background noise 
exceeds an average background noise by more than a predetermined decibel level. 

[047] At act 1110, a wind buffet may be detected when the offset exceeds a 
predetermined threshold (e.g., a threshold > 3 dB) or when a high correlation exists between a 
best-fit line and the low frequency spectrum. Alternatively, a wind buffet may be identified 
by the analysis of time varying spectral characteristics of the input signal. When a line fitting 
detection method is used, the fitting of the line to the suspected wind buffet signal may be 
constrained by some optional acts. Exemplary optional acts may prevent a calculated offset, 
slope, or coordinate point in a wind buffet model from exceeding an average value. Another 
optional act may prevent the wind noise detection method from applying a calculated wind 
buffet correction when a vowel or another harmonic structure is detected. If a vowel or 
another harmonic structure is detected, the wind noise detection method may limit the wind 
buffet correction to values less than or equal to average values. An additional optional act 
may allow the average wind buffet model or attributes to be updated only during unvoiced 
segments. If a voiced or mixed voice segment is detected, the average wind buffet model or 
attributes are not updated under this act. If no voice is detected, the wind buffet model or 
each attribute may be updated through many means, such as through a weighted average or a 
leaky integrator. Many other optional acts may also be applied to the model. 

[048] At act 1112, a signal analysis may discriminate or mark the voice signal from 
the noise-like segments. Voiced signals may be identified by, for example, (1) the narrow 
widths of their bands or peaks; (2) the resonant structure that may be harmonically related; 
(3) their harmonics that correspond to formant frequencies; (4) characteristics that change 
relatively slowly with time; (5) their durations; and when multiple detectors or microphones 
are used, (6) the correlation of the output signals of the detectors or microphones. 

[049] To overcome the effects of wind noise, a wind noise is substantially removed 
or dampened from the noisy spectrum by any act. One exemplary act 1114 adds the 
substantially linear wind buffet model to a recorded or modeled continuous noise. In the 
power spectrum, the modeled noise may then be substantially removed from the unmodified 
spectrum by the methods and systems described above. If an underlying peak or valley 902 
is masked by a wind buffet 202 as shown in Figure 9 or masked by a continuous noise, a 
conventional or modified interpolation method may be used to reconstruct the peak and/or 
valley at act 1116. A time series synthesis may then be used to convert the signal power to 
the time domain at act 1120, which provides a reconstructed voice signal. 

[050] To minimize the "musical noise," squeaks, squawks, chirps, clicks, drips, 
pops, low frequency tones, or other sound artifacts that may be generated in the low 
frequency range by some wind noise processes, a residual attenuation method may also be 
performed before the signal is converted back to the time domain. An optional residual 
attenuation method 1118 may track the power spectrum within a low frequency range. When 
a large increase in signal power is detected, an improvement may be obtained by limiting the 
transmitted power in the low frequency range to a predetermined or calculated threshold. A 
calculated threshold may be equal to or based on the average spectral power of that same low 
frequency range at a period earlier in time. 

[051] Figures 12 and 13 are partial sequence diagrams of a voice enhancement. Like 
the method shown in Figure 11, the sequence diagrams may be encoded in a signal bearing 
medium, a computer readable medium such as a memory, programmed within a device such 
as one or more integrated circuits, or processed by a controller or a computer. If the methods 
are performed by software, the software may reside in a memory resident to or interfaced to 
the wind noise detector 102, a communication interface, or any other type of non-volatile or 
volatile memory interfaced or resident to the voice enhancement logic 100 or 700. The 
memory may include an ordered listing of executable instructions for implementing logical 
functions. A logical function may be implemented through digital circuitry, through source 
code, through analog circuitry, or through an analog source such as through an analog electrical, 
audio, or video signal. The software may be embodied in any computer-readable or 
signal-bearing medium, for use by, or in connection with an instruction executable system, 
apparatus, or device. Such a system may include a computer-based system, a 
processor-containing system, or another system that may selectively fetch instructions from an 
instruction executable system, apparatus, or device that may also execute instructions. 

[052] A "computer-readable medium," "machine-readable medium," 
"propagated-signal" medium, and/or "signal-bearing medium" may comprise any means that contains, 
stores, communicates, propagates, or transports software for use by or in connection with an 
instruction executable system, apparatus, or device. The machine-readable medium may 
selectively be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, 
or semiconductor system, apparatus, device, or propagation medium. A non-exhaustive list 
of examples of a machine-readable medium would include: an electrical connection 
"electronic" having one or more wires, a portable magnetic or optical disk, a volatile memory 
such as a Random Access Memory "RAM" (electronic), a Read-Only Memory "ROM" 
(electronic), an Erasable Programmable Read-Only Memory (EPROM or Flash memory) 
(electronic), or an optical fiber (optical). A machine-readable medium may also include a 
tangible medium upon which software is printed, as the software may be electronically stored 
as an image or in another format (e.g., through an optical scan), then compiled, and/or 
interpreted or otherwise processed. The processed medium may then be stored in a computer 
and/or machine memory. 

[053] As shown in the first sequence of Figure 12, a time series signal may be 
digitized and smoothed by a Hanning window to provide an accurate estimation of a fully 
voiced, a mixed voice, or an unvoiced segment. The complex spectrum for the windowed 
signal is obtained by means of an FFT that separates the digitized signals into frequency bins, 
with each bin identifying an amplitude across a small frequency range. 

[054] In the second sequence, an averaging of the acoustic power in each frequency 
bin during unvoiced segments derives the background noise estimate. To prevent biased 
noise estimates, noise estimates may not occur when abnormal or unpredictable power 
fluctuations are detected. 

[055] In the third sequence, the unmodified spectrum is digitized, smoothed by a 
window, and transformed into the complex spectrum by an FFT. The unmodified spectrum 

30 exhibits portions containing noise-like segments and other portions exhibiting a regular 

harmonic structure. 

[056] In the fourth sequence, a sound segment is fitted to separate lines to model the
severity of the wind and continuous noise. To provide a more complete explanation, an
unvoiced, fully voiced, and mixed voiced sample are shown. The frequency bins in each
sample were converted into the power-spectral domain and logarithmic domain to develop a
wind buffet and continuous noise estimate. As more windows are processed, the average
wind noise and continuous noise estimates are derived.
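
The conversion into the power-spectral and logarithmic domains may be sketched as follows. Expressing the logarithmic domain in decibels, and forming an SNR by subtracting a noise level from a signal level, are assumptions made for illustration. The names `power_db` and `snr_db` and the `floor` guard against a logarithm of zero are hypothetical.

```python
import math

def power_db(spectrum, floor=1e-12):
    """Per-bin power of a complex spectrum, expressed in decibels."""
    return [10.0 * math.log10(max(abs(c) ** 2, floor)) for c in spectrum]

def snr_db(signal_db, noise_db):
    """Per-bin SNR in decibels: signal level minus noise estimate level."""
    return [s - n for s, n in zip(signal_db, noise_db)]

# Hypothetical three-bin spectrum with amplitudes 10, 1, and 0.1.
spectrum = [10 + 0j, 0 + 1j, 0.1 + 0j]
levels = power_db(spectrum)

# Hypothetical flat noise estimate of -20 dB in every bin.
snr = snr_db(levels, [-20.0, -20.0, -20.0])
```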

[057] To detect a wind buffet, a line is fitted to a selected portion of the signal in the
SNR domain. Through a regression, best-fit lines model the severity of the wind noise in
each illustration. A high correlation between one best-fit line and the low frequency
spectrum may identify a wind buffet. Alternatively, a y-intercept that exceeds a
predetermined threshold may also identify a wind buffet. To limit the masking of voice, the
fitting of the line to a suspected wind buffet signal may be constrained by the rules described
above.
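
The line-fitting test may be sketched with an ordinary least-squares regression over the low frequency bins. The 500 Hz cutoff, the correlation and intercept thresholds, and the requirement of a falling slope are assumptions chosen for illustration, as are the names `fit_line` and `looks_like_wind_buffet`. The constraints actually applied to the fit are those of the rules described above.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, correlation)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    corr = sxy / math.sqrt(sxx * syy) if syy > 0.0 else 0.0
    return slope, intercept, corr

def looks_like_wind_buffet(freqs_hz, snr_db, max_freq_hz=500.0,
                           min_abs_corr=0.9, min_intercept_db=20.0):
    """Flag a frame whose low frequency SNR is well modeled by a falling line."""
    low = [(f, s) for f, s in zip(freqs_hz, snr_db) if f <= max_freq_hz]
    if len(low) < 3:
        return False
    xs, ys = zip(*low)
    slope, intercept, corr = fit_line(xs, ys)
    if slope >= 0.0:  # assumed constraint: the fitted line must fall
        return False
    # Either criterion named in the text may identify a buffet.
    return abs(corr) >= min_abs_corr or intercept >= min_intercept_db

freqs = [0.0, 100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
buffet = looks_like_wind_buffet(freqs, [30.0, 25.0, 20.0, 15.0, 10.0, 5.0, 0.0])
quiet = looks_like_wind_buffet(freqs, [2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0])
```

The first hypothetical frame falls steadily from a high intercept and is flagged. The second hovers near a low level with no clear trend and is not.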

[058] To overcome the effects of wind noise, the modeled noise may be dampened
in the unmodified spectrum. In Figure 13, the dampening of the wind buffets and continuous
noise from the unvoiced and mixed voiced samples is shown in the fifth sequence. An
inverse FFT that converts the signal power to the time domain provides the reconstructed
voice signal.
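
The dampening and reconstruction step may be sketched as a per-bin gain followed by an inverse transform. The specification does not state the attenuation rule, so the Wiener-style gain with a floor used here is an assumption, as are the names `dampen` and `inverse_dft`. A direct inverse DFT stands in for an inverse FFT to keep the sketch short.

```python
import cmath
import math

def dampen(spectrum, noise_power, floor_gain=0.1):
    """Scale each bin by an assumed Wiener-style gain, keeping its phase."""
    out = []
    for c, n in zip(spectrum, noise_power):
        p = abs(c) ** 2
        gain = max(floor_gain, 1.0 - n / p) if p > 0.0 else floor_gain
        out.append(c * gain)
    return out

def inverse_dft(spectrum):
    """Direct inverse DFT returning the real part of each time sample.

    The input is expected to be conjugate symmetric, which holds when the
    same noise power is applied to mirrored bins.
    """
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n)
            for k, c in enumerate(spectrum)).real / n
        for t in range(n)
    ]

# Hypothetical 4-bin spectrum; modeled noise power sits in bins 1 and 3.
spectrum = [4 + 0j, 0 + 2j, 0j, 0 - 2j]
modeled_noise = [0.0, 3.0, 0.0, 3.0]
cleaned = dampen(spectrum, modeled_noise)
reconstructed = inverse_dft(cleaned)
```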

[059] From the foregoing descriptions it should be apparent that the above-described
systems may condition signals received from only one microphone or detector. It should also
be apparent that many combinations of systems may be used to identify and track wind
buffets. Besides the fitting of a line to a suspected wind buffet, a system may (1) detect the
peaks in the spectra having an SNR greater than a predetermined threshold; (2) identify the
peaks having a width greater than a predetermined threshold; (3) identify peaks that lack a
harmonic relationship; (4) compare peaks with previous voiced spectra; and (5) compare
signals detected from different microphones before differentiating the wind buffet segments,
other noise-like segments, and regular harmonic structures. One or more of the systems
described above may also be used in alternative voice enhancement logic.
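
Alternatives (1) and (2) may be sketched as a scan for runs of adjacent bins whose SNR exceeds one threshold and whose width exceeds another. The function name `wide_peaks` and both default threshold values are hypothetical. Treating a run of adjacent bins as a single peak is an assumption made for illustration.

```python
def wide_peaks(snr_db, min_snr_db=6.0, min_width_bins=2):
    """Return (start, end) bin ranges, end exclusive, of broad high-SNR runs."""
    peaks = []
    start = None
    # A trailing sentinel closes any run that reaches the last bin.
    for k, s in enumerate(list(snr_db) + [float("-inf")]):
        if s > min_snr_db:
            if start is None:
                start = k
        elif start is not None:
            if k - start > min_width_bins:  # width greater than the threshold
                peaks.append((start, k))
            start = None
    return peaks

# Hypothetical per-bin SNR values in decibels.
snr = [10.0, 12.0, 11.0, 9.0, 0.0, 8.0, 0.0, 7.0, 7.0, 7.0]
candidates = wide_peaks(snr)
```

In this example the single-bin peak at bin 5 is narrow, as a voice harmonic might be, and is excluded, while the two broader runs remain as wind buffet candidates.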

[060] Other alternative voice enhancement systems include combinations of the
structure and functions described above. These voice enhancement systems are formed from
any combination of structure and function described above or illustrated within the attached
figures. The logic may be implemented in software or hardware. The term "logic" is
intended to broadly encompass a hardware device or circuit, software, or a combination. The
hardware may include a processor or a controller having volatile and/or non-volatile memory
and may also include interfaces to peripheral devices through wireless and/or hardwire
mediums.

[061] The voice enhancement logic is easily adaptable to any technology or device.
Some voice enhancement systems or components interface or couple with vehicles as shown in
Figure 14, with instruments that convert voice and other sounds into a form that may be
transmitted to remote locations, such as landline and wireless telephones and audio
equipment as shown in Figure 15, and with other communication systems that may be
susceptible to wind noise.

[062] The voice enhancement logic improves the perceptual quality of a processed 
voice. The logic may automatically learn and encode the shape and form of the noise 
associated with the movement of air in real time or delayed time. By tracking selected
attributes, the logic may eliminate or dampen wind noise using a limited memory that 
temporarily or permanently stores selected attributes of the wind noise. The voice 
enhancement logic may also dampen a continuous noise and/or the squeaks, squawks, chirps, 
clicks, drips, pops, low frequency tones, or other sound artifacts that may be generated within 
some voice enhancement systems and may reconstruct voice when needed. 

[063] While various embodiments of the invention have been described, it will be 
apparent to those of ordinary skill in the art that many more embodiments and 
implementations are possible within the scope of the invention. Accordingly, the invention is 
not to be restricted except in light of the attached claims and their equivalents. 